Understanding who talks about what: comparison between the information treatment in traditional media and online discussions

We study the dynamics of interactions between a traditional medium, The New York Times newspaper, and its followers on Twitter, using a massive dataset. The dataset consists of the metadata of the articles published by the newspaper during the first year of the COVID-19 pandemic, and the posts published on Twitter by a large set of followers of the @nytimes account, along with those published by a set of followers of several other media outlets of different kinds. The dynamics of discussions held on Twitter by exclusive followers of a medium show a strong dependence on the medium they follow: the followers of @FoxNews show the highest similarity to each other and a strong differentiation of interests from the general group. Our results also reveal the difference in the attention paid to the U.S. presidential elections by the newspaper and by its followers, and show that the topic related to the “Black Lives Matter” movement started on Twitter and was addressed later by the newspaper.

FIG. S1: Delay times ∆t between @nytimes posting a link to one of its articles and the reactions of its followers. The dark line labeled "link" tracks the delay between the publication of an article on the website and the appearance of tweets from @nytimes followers containing a link to it. We show this measurement conditioned on the section of the NYT in which the corresponding articles appeared.

FIG. S2: Dynamics of self- and cross-similarities of different subpopulations corresponding to the followers of different media accounts on Twitter, recomputed after removing the #endsars topic from the topic vectors. In this way we identify the origin of the very high peak in October 2020 shown in Fig. 7 of the main text. For clarity we concentrate on the curves involving the followers of @nytimes and @FoxNews, along with a randomized sample that gathers the followers of all media together (labelled "all"). The labels '@nytimes excl.' and '@FoxNews excl.' refer to the subpopulations that follow only the cited medium. 'all × @FoxNews excl.' is the cross-similarity between the exclusive followers of @FoxNews and all users (including the followers of @FoxNews) in our dataset.

This peak is caused by many hashtag usages by a relatively low number of users. It therefore does not appear in the entropy, which only considers usages by unique users, and this stresses the importance of analysing data with different indicators. For comparison, each user who used #endsars did so on average 14.1 times, while each user who tweeted #blacklivesmatter did so on average only 3.3 times. The latter hashtag thus has much broader support and causes a stronger signal in Fig. 1 of the main text.
In contrast, the similarity takes into account the number of usages of a topic by each user. For the users who tweeted #endsars, the corresponding component of their topic vector is therefore much larger than the others, so that the vector is almost aligned with the #endsars direction, giving rise to the strong increase in similarity. We found that more than about 40% of the #endsars hashtags were tweeted by users who specified "Nigeria" as (part of) their location. Since most users do not specify any location, we conjecture that the hashtag was mainly used by users in Nigeria, who were captured in our dataset because they follow the most popular U.S. media.
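The effect described above can be illustrated with a minimal sketch. It assumes that the similarity between two users is the cosine of the angle between their topic-count vectors; the text only says that heavy #endsars users become "almost aligned" with that direction, so the cosine form, the three-topic layout and all user names and counts below are illustrative assumptions, not the data or the exact measure of the main text.

```python
import math

def cosine(u, v):
    """Cosine similarity between two topic-count vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical topic order: [covid, elections, endsars]; counts are made up.
ENDSARS = 2
before = {"user_a": [3, 1, 0], "user_b": [1, 3, 0], "user_c": [2, 2, 0]}
# Two users each tweet #endsars 14 times; the third never uses it.
after = {"user_a": [3, 1, 14], "user_b": [1, 3, 14], "user_c": [2, 2, 0]}

sim_before = cosine(before["user_a"], before["user_b"])
sim_after = cosine(after["user_a"], after["user_b"])

# Usage-based and unique-user-based views of the same hashtag.
total_usages = sum(v[ENDSARS] for v in after.values())
unique_users = sum(1 for v in after.values() if v[ENDSARS] > 0)

print(f"similarity before burst: {sim_before:.2f}")
print(f"similarity after burst:  {sim_after:.2f}")
print(f"#endsars usages: {total_usages}, unique users: {unique_users}")
```

In this toy case the two heavy users move from moderately similar to nearly parallel vectors, while the hashtag is still carried by only two distinct users. An indicator that counts each user once per topic would register little change, which is consistent with the contrast drawn between the similarity and the entropy.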
In Figure S3 we present the dynamics of the self-similarities for exclusive (green) and non-exclusive (violet) followers of the different news agencies.
The behaviour of the self-similarities of exclusive followers depends on the medium they follow. For the exclusive followers of @TIME, @FoxNews and @WSJ the self-similarity tends to be higher than for their general followers, with the exception of the beginning of the COVID-19 pandemic, between March and June.
In contrast, from June onwards, the dynamics of the exclusive and non-exclusive followers of @nytimes, @CNN, @AP and @washingtonpost are quite similar within each medium, and also show similar peaks.
An interesting peak is also observed at the beginning of February for the exclusive followers of @AP and @FoxNews, a signal that is absent from the similarities of the exclusive followers of the other media.

III. MAIN TOPICS DISCUSSED BY @FOXNEWS FOLLOWERS
To complement Fig. 2 in the main text, which shows the dynamics of the eight largest topics discussed by the @nytimes followers, Fig. S4 shows the analogous time dynamics of the main topics discussed by exclusive and non-exclusive @FoxNews followers on Twitter.
The most active topic for @FoxNews followers is related to the presidential elections, while the coronavirus pandemic is in second place, as opposed to the behavior of @nytimes followers in Fig. 2 of the main text.
The treatment of Black Lives Matter is also fundamentally different between @nytimes and @FoxNews followers: it is the most discussed topic among the former during its peak, whereas among the latter it is strongly associated with "content marketing strategies".
FIG. S3: Self-similarities conditioned on which media outlet the users are following (excluding the #endsars topic).

Table S1 lists the main hashtags associated with the most frequent topics discussed by the NYT followers. While, as expected, the most used topic over the period is the one related to the COVID-19 pandemic, other important topics follow major events of the period, such as the BlackLivesMatter protests and the US elections. Notice that other hashtags evoking the pandemic (and also the elections) are part of different topics. This is because both the pandemic and the elections intervene in several aspects of public discussion, and the method is able to detect this.

V. NETWORK VISUALIZATION
In Figure S5 we offer a visualization of part of the semantic network of hashtags, including the 1.5% most frequent hashtag pairs. Here, the size of a node (hashtag) represents the number of hashtags that it is linked to. Colors represent the community structure found by Infomap [1].
#covid and #coronavirus, in the pink community, stand out as the most connected hashtags in the network. Other discussion topics are related to the elections (gray), cryptocurrencies (violet) and technology (light blue).
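As a rough illustration of how such a network can be assembled, the sketch below counts hashtag co-occurrences within tweets, keeps only a top fraction of the most frequent pairs, and computes the degree of each node, which is the quantity mapped to node size in Figure S5. The function names, the toy tweets and the 40% cutoff are hypothetical; the exact preprocessing behind the figure is not specified here, and the Infomap community detection step is deliberately omitted.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_counts(tweets):
    """Count how often each unordered hashtag pair appears in the same tweet."""
    counts = Counter()
    for tags in tweets:
        unique = sorted({t.lower() for t in tags})
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts

def top_pairs(counts, fraction):
    """Keep the given fraction of distinct pairs, most frequent first."""
    keep = max(1, math.ceil(len(counts) * fraction))
    return counts.most_common(keep)

def degrees(pairs):
    """Number of distinct neighbours of each hashtag among the kept pairs."""
    deg = Counter()
    for (a, b), _count in pairs:
        deg[a] += 1
        deg[b] += 1
    return deg

# Toy data (made up): each inner list holds the hashtags of one tweet.
tweets = [
    ["#covid", "#coronavirus"],
    ["#covid", "#coronavirus", "#lockdown"],
    ["#covid", "#vote"],
    ["#covid", "#vote"],
    ["#bitcoin", "#crypto"],
]

counts = cooccurrence_counts(tweets)
kept = top_pairs(counts, fraction=0.4)  # the figure uses 1.5% on the full data
node_degree = degrees(kept)
print(node_degree.most_common())
```

With real data the kept edge list could be passed to a community detection tool to obtain the colouring; here the point is only that thresholding on pair frequency removes rare pairs such as the cryptocurrency one in the toy example, and that the most connected hashtag ends up with the largest degree.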